4: Measurement of Exposure and
Outcome in Epidemiological
Studies used for Quantitative
Estimation and Prediction of Risk


Bruce Armstrong and Paolo Boffetta


4.1 Introduction

Making the right measurements of the exposure (including agents
confounded with it and modifiers of its effect) and of the outcome of
interest (usually cancer, a preneoplastic lesion or another
intermediate step between exposure and cancer), and making them
accurately, are crucial to the valid quantitative estimation and
prediction (QEP) of cancer risk. These requirements are two sides of
the same coin: if the right measurements are not made, the
consequences for QEP will be the same as if inaccurate measurements
are made: the dose-response relationship will be biased and
predictions made from it may be far from the truth. Both are errors
in measurement.

Theoretically, error in measurement of exposure and outcome will have
similar effects on the dose-response relationship. In the simplest
kind of study, investigating the association between a binary
exposure variable and a binary outcome variable in the absence of
confounding or other intervening variables, error in measurement of
either the exposure or the outcome variable has the same effect on
the estimate of association between exposure and outcome. In
practice, however, the measurement of exposure is usually more
complex than that of outcome, mainly because of the problem of
measuring past exposures, the need to measure exposure to several
agents (the agent of interest, confounders and effect modifiers), and
the multiplicity of methods available to measure exposure. As a
result, greater attention is usually paid to the correct measurement
of exposure than to that of outcome. In addition, the study of the
effect of exposure measurement error has presented more interesting
theoretical problems than that of outcome measurement error, and has
received more attention in the relevant literature.

In principle, error in the measurement of exposure is a concern only
for observational research. Error in measurement of outcome, on the
other hand, is probably of equal concern in both observational and
experimental research. In well designed and conducted experiments,
the exact exposure of each subject to the agent should be known and
controlled by the investigator, and randomization should prevent
confounding. In practice, it is usually desirable to measure key
confounding variables and control for them in the analysis because of
the possibility that, by chance, important confounding may be
present. In addition, measurement error again becomes an issue when,
in an experimental study, it is desirable to estimate exposure within
the body at the level of the target tissue or target molecules. In
these experimental situations, the problems and issues related to
error in measurement of exposure are the same as they are in
observational research.

In this chapter, we shall consider in detail aspects of the
measurement of exposure from the perspective of observational
epidemiological research; we shall outline the exposure measurements
that should be made, the effects of error in exposure measurement on
QEP, the prevention of error in exposure measurement and the control
of its effects when it has not been prevented. These issues will also
be considered in a more limited way for outcome measurement in
epidemiological studies. The book Principles of Exposure Measurement
in Epidemiology (Armstrong et al., 1992) is acknowledged as a major
source of material for this chapter.

4.2 Making the right exposure measurements

4.2.1 Nature of exposure measurements 

Exposure variables should be specific in the sense that they measure
indivisible agents of exposure. Thus, for example, it is better to
measure the different forms of tobacco smoking (cigarettes, pipes,
cigars, etc.) separately than generically as "smoking". The
dose-response relationship of smoking-related lung cancer differs by
method of smoking (see, for example, Higgins et al., 1988) and use of
a non-specific variable will lead to a dose-response relationship
that cannot readily be interpreted in terms of any individual
exposure and therefore cannot be used to undertake specific QEP.
Similarly, the exposure variables measured should permit the
distinction of the exposure of interest from possibly confounding
exposures. In this regard, it is relevant to note that, in
evaluations of the evidence for the carcinogenicity of chemicals in
humans made by the International Agency for Research on Cancer, the
commonest reason for applying a "limited" or "inadequate" evidence
classification was that exposure to the agent of interest had not
been distinguished from exposure to other potentially carcinogenic
agents in the environment in which it had occurred (Armstrong, 1985).
If such data cannot support qualitative assessment of risk, they will
be even less useful for QEP.

The exposure variables should also be sensitive in the sense that
they include all ways by which subjects may be exposed to the agent
of interest. For example, in examining the relationship between
environmental tobacco smoke (passive smoking) and lung cancer, a
number of studies in non-smoking women have used the spouse's smoking
habits as the sole exposure variable. This variable excludes home
exposures prior to marriage, home exposures other than to the
spouse's smoke, and workplace, social and other exposures outside the
home at any time. The importance of multiple sources of exposure to
environmental tobacco smoke has been shown in recall of lifetime
exposure (Cummings et al., 1989) and in comprehensive analyses of the
determinants of current exposure (Coghlin et al., 1989; Riboli et
al., 1990). A complete history of exposure is necessary, therefore,
to obtain accurate measurements of exposure to environmental tobacco
smoke. Variation in the error caused by incomplete (or insensitive)
measurement of passive smoking may explain some of the variation in
the strength of the relationship that has been observed between it
and lung cancer in non-smoking women (see, for example, Garfinkel,
1981; Hirayama, 1981). More recent studies have attempted more
comprehensive coverage of passive exposure to tobacco smoke (see, for
example, Fontham et al., 1991; Stockwell et al., 1992).

4.2.2 Measurement of dose

For QEP, the dose should be measured in quantitative terms,
preferably as a dose rate (rather than the dose accumulated over,
e.g., the whole of the period of exposure), in the most fundamental
units in which the agent is usually measured. For example, to obtain
an accurate quantitative measure of dose of the combined oral
contraceptive pill, a complete history covering the nature and period
of use of each kind of pill used should be obtained. Provided the
exact formulations are identified (and this is not impossible; UK
National Case-Control Study Group, 1989), use of combined oral
contraceptives can be measured not only in terms of, say, years of
use but also in actual weights of the oestrogenic and progestagenic
components consumed. Information on dose rate would allow measurement
of intake in specific periods of time within the whole exposure
period (see below).
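
As a minimal sketch of how such a history might be turned into
quantitative dose measures, the Python fragment below computes years
of use and the total weight of oestrogen consumed, optionally within
an age window. The record layout, the field names, the pill strengths
and the figure of 21 active pills in each of 13 cycles per year are
all hypothetical and serve only to illustrate the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class UseEpisode:
    """One period of use of a single pill formulation (hypothetical layout)."""
    start_age: float      # age in years when use of this pill began
    end_age: float        # age in years when use of this pill ended
    oestrogen_ug: float   # micrograms of oestrogen per active pill (illustrative)


# Assumption for illustration: 21 active pills in each of 13 cycles a year.
ACTIVE_PILLS_PER_YEAR = 21 * 13


def years_of_use(history):
    """Total duration of use, summed over all episodes."""
    return sum(e.end_age - e.start_age for e in history)


def cumulative_oestrogen_mg(history, from_age=None, to_age=None):
    """Total oestrogen (mg) consumed, optionally restricted to an age window."""
    total_ug = 0.0
    for e in history:
        lo = e.start_age if from_age is None else max(e.start_age, from_age)
        hi = e.end_age if to_age is None else min(e.end_age, to_age)
        if hi > lo:  # the episode overlaps the requested window
            total_ug += (hi - lo) * ACTIVE_PILLS_PER_YEAR * e.oestrogen_ug
    return total_ug / 1000.0


# Hypothetical subject: 3 years on a 50 ug pill, then 5 years on a 30 ug pill.
history = [UseEpisode(20, 23, 50.0), UseEpisode(23, 28, 30.0)]
print(years_of_use(history), cumulative_oestrogen_mg(history))
```

Keeping one record per formulation and period, rather than a single
"years of use" figure, is what allows intake to be recomputed for any
window of time within the whole exposure period.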

A number of levels at which dose may be measured are important to QEP
(Figure 4.1). The exposure, sometimes called the available dose, is
measured in the environment external to the subject, e.g. the
concentration of asbestos fibres per ml of ambient air over a small
time interval. The exposure is often the measurement used for
regulatory purposes, and may therefore be the quantity most relevant
to QEP. It does not usually coincide with the administered dose or
intake, i.e. the actual amount of the agent coming into contact with
the human body. How much of the exposure becomes administered dose
depends on the subject's physiology and behaviour, e.g. the
respiratory volume at rest and during activity and the quantities of
food, drinks, and medications actually ingested. From a biological
viewpoint, even the administered dose can usually only be regarded as
a proxy measurement of the absorbed dose or uptake, i.e. the dose
that actually enters the body.



Figure 4.1. Levels at which dose may be measured


The absorbed dose, in turn, is a proxy for the dose that really
matters in terms of causing the disease, the active or biologically
effective dose at the sites in the body (organs, tissues, cells,
molecules) which are the specific targets of action of the agent.

It could be argued that, for accurate quantitative estimation of
risk, the most appropriate dose to measure would be the biologically
effective dose at the level of interaction with the cellular targets
(e.g. specific DNA adducts or exposure-specific mutations) because it
is measured after all preceding sources of variation (environmental
conditions influencing available dose, behaviours affecting the
administered dose, physiological factors affecting the absorbed dose
and metabolic factors affecting the active dose) have been taken into
account. On the other hand, available dose is the measurement most
commonly (although not invariably) subject to public health
regulation and therefore the one for which QEP is required. There is,
then, a conflict between what might be best for estimation and what
is needed for prediction.

If, as may be hoped, measurements of biologically effective dose are
able to fulfill their promise as accurate, quantitative predictors of
risk of human cancer, at least in some circumstances, it may be that
the biologically effective dose of some agents in exposed humans will
become the most appropriate variable for public health regulation. In
the meantime, it will still be important to strive for accurate
measurement of available dose as well as to develop and evaluate the
predictive capacities of absorbed and biologically effective doses
and to understand better the relationships between these various
levels of dose.

4.2.3 Measurement of variation in exposure with 
time 

As far as possible, each exposure should be characterized by when it
first began, when it finally ended (if at all), and how the dose rate
varied during the period of exposure. The time relationships of
exposure are important for several reasons. First, duration of
exposure is a critical determinant of the total amount of exposure
that has occurred. Second, for a cancer occurring at a particular
time, there exists a "critical" (or effective) time window (the
aetiologically relevant exposure period) during which exposure can be
relevant to its causation (Rothman, 1981). Inclusion of exposure
outside that time window will lead to error in the exposure
measurement (a case of non-specificity in the terms outlined above),
and therefore to error in the estimation of dose-response. We do not
generally know when the effective time window for exposure occurred
for any particular exposure-cancer combination. The matter is further
complicated by the possibility of multiple effective time windows for
an agent when, for example, it has both early- and late-stage
effects. For epithelial cancers, it has been commonly assumed that
the effective time window was at least 10 years ago, and exposure in
the last 10 or so years is therefore excluded from the analysis. This
approach ignores late-stage effects. An alternative approach, such as
that exemplified by the "serially additive expected dose model"
(Smith et al., 1980), may be used to seek the location of the
effective time window empirically. This allows for the possibility
that more than one such time window will be found. Full data on the
time relationships of exposure may allow the construction of more
complete and potentially more informative mechanistic
exposure-response models for QEP than are currently available (see
Chapter 6).
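
The common practice of excluding recent exposure, and the more
general idea of a time window, can be sketched in a few lines of
Python. This is only an illustration under simple assumptions: doses
are recorded as one value per calendar year, the function and
variable names are hypothetical, and the example is not an
implementation of the serially additive expected dose model.

```python
def lagged_cumulative_exposure(dose_by_year, reference_year, lag_years=10):
    """Sum annual doses, excluding the `lag_years` years before the reference year."""
    cutoff = reference_year - lag_years
    return sum(dose for year, dose in dose_by_year.items() if year < cutoff)


def windowed_exposure(dose_by_year, reference_year, earliest_lag, latest_lag):
    """Sum annual doses received between `earliest_lag` and `latest_lag`
    years before the reference year (a candidate effective time window)."""
    first = reference_year - earliest_lag
    last = reference_year - latest_lag
    return sum(dose for year, dose in dose_by_year.items() if first <= year < last)


# Hypothetical subject: 2 dose units per year from 1960 to 1989, diagnosed in 1990.
doses = {year: 2.0 for year in range(1960, 1990)}

print(lagged_cumulative_exposure(doses, 1990))   # exposure up to 1979 only
print(windowed_exposure(doses, 1990, 20, 10))    # exposure in 1970-1979 only
```

Evaluating `windowed_exposure` over a series of candidate windows is
one simple way of searching empirically for the window that best
predicts disease, in the spirit of the approach described above.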

Finally, pattern of exposure during the exposure period may also be
important. For example, to explain the apparently anomalous
occurrence of a higher incidence of malignant melanoma of the skin in
indoor workers than in outdoor workers, it was postulated that a
particular total dose of sunlight may be more effective in causing
melanoma if it is received intermittently or irregularly rather than
frequently or continuously (Holman et al., 1983). There is empirical
evidence supporting this hypothesis (Armstrong, 1988). Unless such
pattern effects are taken into consideration, QEP may produce
misleading results.


Measurement of duration, dose rate, and pattern of exposure will
permit the maximum flexibility in defining exposure in a biologically
relevant way when analysing the relationship between exposure and
disease. Alternatively, if there is no prior hypothesis regarding
which representation of dose and time would be the most appropriate,
the representation that best predicts disease incidence may then be
sought empirically. In practice, however, it may be difficult to
distinguish between different, possibly appropriate representations
of dose, especially when many different characteristics of dose are
included in statistical dose-response models.

Lee-Feldstein (1989) has given an example of an empirical search for
the most appropriate representation of dose in relation to time. She
examined the quantitative relationships between exposure to arsenic
in a copper smelter and risk of lung cancer in two cohorts of men,
one first employed before 1925 and the other employed between 1925
and 1947. Exposure was defined in terms of job with maximum exposure
to arsenic (a "peak" exposure definition), cumulative exposure based
on arithmetic or geometric mean levels of arsenic in specific job
areas, and time-weighted average exposures based on arithmetic or
geometric means. Analyses were carried out with and without exclusion
of exposure in the last 10 years before death. Table 4.1 shows the
results obtained for both cohorts with three different definitions of
exposure.

[Table 4.1. Results for the first and second cohorts by definition of
exposure; the columns include β, the SE of β and the likelihood ratio
statistic for β = 0. The table body is not recoverable from the
source; only the values 0.55 and 0.30 survive for the maximum
exposure category.]

The cumulative and time-weighted average exposures shown are those
based on the arithmetic means since they gave slightly higher
likelihood ratio statistics than those based on geometric means. In
the first cohort, using the likelihood ratio statistic in each case
as a criterion, maximum exposure showed the strongest statistical
association with lung cancer, whereas time-weighted average and
cumulative exposure were similar in their association. In the second
cohort, however, the time-weighted average appeared to be more
strongly associated with lung cancer than cumulative exposure, and
the maximum exposure category was intermediate between the two. It
appears, in this case, that some measure of intensity of exposure
over time was important in determining risk. The values in Table 4.1
were little affected by removing the last 10 years of exposure from
the exposure estimates.
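
To make the three kinds of exposure definition concrete, the
following Python sketch derives a peak, a cumulative and a
time-weighted average exposure from a job history. It is an
illustration only: the job histories and exposure levels are
invented, the function name is hypothetical, and the code does not
reproduce Lee-Feldstein's analysis.

```python
def exposure_summaries(job_history):
    """Summarize a job history given as (duration_years, mean_level) pairs."""
    total_years = sum(duration for duration, _ in job_history)
    cumulative = sum(duration * level for duration, level in job_history)
    return {
        "peak": max(level for _, level in job_history),      # highest job level
        "cumulative": cumulative,                             # level x years
        "time_weighted_average": cumulative / total_years,   # mean intensity
    }


# Two hypothetical workers with the same cumulative exposure (23 level-years).
short_intense = [(2, 10.0), (3, 1.0)]
long_mild = [(23, 1.0)]

print(exposure_summaries(short_intense))
print(exposure_summaries(long_mild))
```

The two workers are indistinguishable on cumulative exposure but
differ markedly on the peak and time-weighted average definitions,
which is why the choice of representation can change the apparent
strength of a dose-response relationship.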

With respect to time, it is relevant to note that some measurement
instruments relate only to comparatively short periods of exposure
and are therefore of limited use for QEP of cancer, unless multiple
measurements can be or have been taken over the aetiologically
relevant exposure period or exposure can be considered to be
reasonably constant over time. For example, measurement of the
concentration of asbestos fibres in air relates, strictly speaking,
only to the period of time over which the air sample was taken,
measurement of urinary cotinine reflects exposure to tobacco smoke
only within the past 3-4 days (Riboli et al., 1990), and measurement
of adducts of aflatoxin B1 with serum albumin reflects intake of
aflatoxin B1 over the preceding several weeks to a few months (Wild
et al., 1990). Unless records exist, such measurements may be of
little value in documenting exposure in a case-control study of
cancer in which the aetiologically relevant exposure period may have
been 20 or 30 years ago. Similarly, they are of limited value in
prospective cohort studies unless they are repeated at intervals
during the follow-up period. If repeated, they could provide the best
possible data on dose rate and pattern of exposure obtainable. Such
data, however, may bring additional problems, such as the
non-independence of the multiple measures in any individual and the
possibility that the errors in measurement may vary with time, e.g.
because of changes in the technology used.


Methods that purport to be able to measure exposure over long periods
of time also have limitations. First, even if accurate (as, for
example, measurements of UV-specific mutations of the p53 gene in
normal skin might be for UV exposure; Nakazawa et al., 1994), they
may provide information on cumulative exposure only and tell us
nothing about dose rate and pattern of exposure. Second, when based
on human recall, as commonly they must be, they are usually of
uncertain, if not doubtful, accuracy.

4.3 Effects of exposure measurement error on 
QEP 

Measurement error can be simply conceived as follows:

$$X_i = T_i + b + E_i$$

where the observed measure, $X_i$, differs from the true value $T_i$
by the systematic error or bias, $b$, which occurs, on average, in
the measurements of all measured subjects, and the non-systematic
error, $E_i$, that varies unpredictably from subject to subject
(Armstrong et al., 1992). $X$, $T$ and $E$ are variables with
distributions; their expectations are denoted by $\mu_X$, $\mu_T$ and
$\mu_E$ and their variances by $\sigma^2_X$, $\sigma^2_T$ and
$\sigma^2_E$ respectively. Because the average measurement error in
$X$ is expressed as a constant, $b$, it follows that $\mu_E$, the
population mean of the non-systematic measurement error, is assumed
to be zero.
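
A short Python sketch may help to show what this model implies for
the distribution of the observed measure. It rests on one assumption
that the model as stated does not itself require, namely that the
non-systematic error is uncorrelated with the true value; under that
assumption the variance of the observed measure is the sum of the
variances of T and E, and the correlation of T with X equals the
ratio of their standard deviations. The function name and the
numerical values are hypothetical.

```python
import math


def observed_moments(mu_t, var_t, bias, var_e):
    """Mean and variance of X = T + b + E, and the correlation of T with X.

    Assumes that E has mean zero and is uncorrelated with T.
    """
    mu_x = mu_t + bias                  # the bias shifts the mean only
    var_x = var_t + var_e               # the random error inflates the variance only
    rho_tx = math.sqrt(var_t / var_x)   # correlation of the true with the observed value
    return mu_x, var_x, rho_tx


# Hypothetical values: true mean 10, true variance 4, bias 1.5, error variance 12.
print(observed_moments(10.0, 4.0, 1.5, 12.0))
```

The sketch separates the two components of error: the bias moves the
whole distribution without changing its spread, whereas the
non-systematic error leaves the mean unchanged and weakens the
correlation between observed and true values.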

When two or more groups of subjects are being compared, measurement
error can be either differential or non-differential. It is
differential if the value of $b$ (the bias) differs between the
groups or if the precision of $X$ (the observed measurements) differs
between the groups (due to differences between the groups in the
value of $\sigma^2_E$), or if both are true. If neither is true, the
error is non-differential.

4.3.1 Effect of exposure measurement error on 
the dose-response relationship 

The effect of error in the measurement of the primary exposure on the
empirical dose-response relationship depends on (i) the type of
study, namely whether it is analytical (cohort or case-control) or
ecological; (ii) the type of measurement, namely whether it is
continuous or categorical; and (iii) whether the error is
differential or non-differential.

4.3.1.1 Analytical epidemiological studies

In general terms, it can be said that, when a continuously
distributed measure of exposure is related to disease outcome in a
logistic model and there is differential error in measurement of the
exposure, the observable odds ratio ($OR_O$) for the disease, per
unit of exposure, bears no predictable relationship to the true odds
ratio ($OR_T$). In other words, it may be closer to the null value,
further from the null value, or on the other side of the null value
(e.g. below 1.0 instead of above 1.0) from the true value.

If error in measurement of a continuous variable is non-differential,
its effects may be more predictable. If it is assumed that the degree
of measurement error is uncorrelated with the true value of the
measurement and that the true values are normally distributed and
have the same variance in subjects with and without disease, it can
be shown that non-differential error in exposure measurement causes
flattening of the curve of the relationship of the odds ratio with
exposure towards the null value (odds ratio constant at 1.0 with
increasing exposure; see Figure 4.2). The relationship between the
observable logistic regression coefficient, $\beta_O$ (which is
related to the observable odds ratio for $u$ units of exposure as
follows: $OR_O = \exp(\beta_O u)$), and the true logistic regression
coefficient is:

$$\beta_O \approx \rho^2_{TX} \beta_T$$

In this expression, $T$ and $X$ are the true and observed values of
the measurement, respectively, $\rho_{TX}$ is the coefficient of
correlation of $T$ with $X$ (the validity coefficient), and $\beta_T$
is the true logistic regression coefficient. Thus $\beta_O$, and
therefore $OR_O$, falls as the value of $\rho_{TX}$ falls. For
example, when $\rho_{TX}$ has a value of 0.5, an $OR_T$ of 2.0 for
one unit of exposure is attenuated substantially to an $OR_O$ of
1.19.
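
The arithmetic of this attenuation can be checked with a few lines of
Python. The function below is a sketch under the assumptions stated
in the text (non-differential error, uncorrelated with the true
value, normally distributed true values with equal variance in
subjects with and without disease); its name and parameters are
hypothetical.

```python
import math


def observed_odds_ratio(true_or_per_unit, validity_coefficient, units=1.0):
    """Observable odds ratio for `units` of exposure, given the true odds
    ratio per unit and the validity coefficient (correlation of T with X)."""
    beta_true = math.log(true_or_per_unit)                # true logistic coefficient
    beta_observed = validity_coefficient ** 2 * beta_true   # attenuated coefficient
    return math.exp(beta_observed * units)


# Example from the text: validity coefficient 0.5, true odds ratio 2.0 per unit.
print(round(observed_odds_ratio(2.0, 0.5), 2))
```

Because the validity coefficient enters as its square, even a
moderately valid measure (a coefficient of 0.7, say) retains only
about half of the true logistic regression coefficient.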

When the assumptions of the preceding paragraph are not true, the
odds ratio curve shown in Figure 4.2 can take a variety of shapes
including convex upwards, convex downwards with the lowest odds
ratios falling below the null value, or even sigmoid.

Figure 4.2. Relationship between observable logistic regression
coefficient and true logistic regression coefficient for different
values of true and observed measurement (see text for details)

When the exposure variable is categorical, the effects of measurement
error on the exposure-disease and dose-response relationships are
analogous to those when the variable is continuous. If there is
differential error in measurement, the observable odds ratio (or odds
ratios if there are more than two exposure categories) may bear
almost any relationship to its true value.

If there is non-differential measurement error, the effect depends on
whether there are two or more than two exposure categories. When
there are two exposure categories, $OR_O$ will be biased towards the
null value relative to [...]

The position is more complex and less predictable when there are more
than two exposure categories (Birkett, 1992). Briefly, when risk of
disease truly increases monotonically with increasing exposure, the
$OR_O$ for the most extreme exposure category will always be biased
towards the null while that for intermediate categories may be biased
away from it. [...] overall direction of the relationship cannot be
reversed. However, $OR_O$ for the lowest category can cross over the
null, and a monotonic dose-response relationship may thus appear as a
U- or J-shaped curve.
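
For the case of two exposure categories, the bias towards the null
can be illustrated with expected counts from a hypothetical
case-control study. In the Python sketch below, the sensitivity and
specificity of the exposure measurement are the same in cases and
controls (that is, the error is non-differential); all counts and
parameter values are invented for illustration.

```python
def odds_ratio(cases, controls):
    """Odds ratio from (exposed, unexposed) counts in cases and in controls."""
    return (cases[0] * controls[1]) / (cases[1] * controls[0])


def misclassify(group, sensitivity, specificity):
    """Expected (exposed, unexposed) counts after imperfect classification."""
    exposed, unexposed = group
    observed_exposed = sensitivity * exposed + (1.0 - specificity) * unexposed
    observed_unexposed = (1.0 - sensitivity) * exposed + specificity * unexposed
    return observed_exposed, observed_unexposed


# Hypothetical true counts: (exposed, unexposed)
true_cases = (200.0, 100.0)
true_controls = (100.0, 200.0)

# The same sensitivity and specificity are applied to both groups.
observed_cases = misclassify(true_cases, sensitivity=0.8, specificity=0.9)
observed_controls = misclassify(true_controls, sensitivity=0.8, specificity=0.9)

print(odds_ratio(true_cases, true_controls))                    # true odds ratio
print(round(odds_ratio(observed_cases, observed_controls), 2))  # attenuated
```

Applying different sensitivities or specificities to cases and
controls in the same sketch would make the error differential, and
the observed odds ratio could then fall on either side of the true
value.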

4.3.1.2 Ecological studies

The effect of measurement error on dose-response estimation in
ecological studies has only recently been considered (Brenner et al.,
1992). For an ecological study in which exposure was measured by the
proportion of individuals in each population who were exposed, it was
shown that bias in estimated relative risks was always away from,
rather than towards, the null value and that the bias could be quite
large. It appears, however, that when the exposure of each population
is estimated by a single common measure (e.g. area air pollution), as
would be the most useful for the quantitative estimation of risk, the
bias would usually be towards the null value (Brenner et al., 1992).
This case, however, has not been rigorously worked out.

4.3.2 Measurement error in confounding 

Error in the measurement of a confounding variable influences the
results of statistical control of the effects of this variable on the
dose-response relationship of the primary exposure variable with
disease.

4.3.2.1 Analytical epidemiological studies

When the confounder is measured with non-differential error and the
exposure is measured perfectly, the effects of the confounder will be
incompletely controlled and the effects of the exposure, independent
of the confounder, will appear greater or smaller than they really
are, depending on the direction of confounding. When the exposure is
measured with non-differential error and the confounder is measured
perfectly, the adjusted disease-exposure association may be biased
towards the null value to an even greater degree than the crude
association. When both are measured with error, the adjusted
disease-exposure association may, with reference to the true
disease-exposure association, be biased towards or away from the
null.
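
The first of these situations, incomplete control of confounding when
the confounder is measured with non-differential error, can be
illustrated with expected counts. The Python sketch below uses an
invented cohort in which the exposure has no effect on disease within
levels of a binary confounder; the Mantel-Haenszel summary odds ratio
is used here as one convenient method of adjustment and is an
assumption of this example rather than a method named in the text.

```python
# Expected counts in a hypothetical cohort:
# (confounder, exposure) -> (cases, non-cases)
TRUE_CELLS = {
    (1, 1): (160.0, 640.0),
    (1, 0): (40.0, 160.0),
    (0, 1): (10.0, 190.0),
    (0, 0): (40.0, 760.0),
}


def misclassify_confounder(cells, sensitivity, specificity):
    """Redistribute counts over the measured confounder, independently of
    exposure and disease (non-differential error)."""
    observed = {key: [0.0, 0.0] for key in cells}
    for (confounder, exposure), counts in cells.items():
        p_measured_1 = sensitivity if confounder == 1 else 1.0 - specificity
        for measured, p in ((1, p_measured_1), (0, 1.0 - p_measured_1)):
            for i, n in enumerate(counts):
                observed[(measured, exposure)][i] += p * n
    return {key: tuple(value) for key, value in observed.items()}


def mantel_haenszel_or(cells):
    """Summary odds ratio for exposure, stratified by the confounder."""
    numerator = denominator = 0.0
    for stratum in (0, 1):
        a, b = cells[(stratum, 1)]   # exposed: cases, non-cases
        c, d = cells[(stratum, 0)]   # unexposed: cases, non-cases
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


def crude_or(cells):
    """Odds ratio for exposure ignoring the confounder."""
    a = sum(cells[(s, 1)][0] for s in (0, 1))
    b = sum(cells[(s, 1)][1] for s in (0, 1))
    c = sum(cells[(s, 0)][0] for s in (0, 1))
    d = sum(cells[(s, 0)][1] for s in (0, 1))
    return (a * d) / (b * c)


measured_cells = misclassify_confounder(TRUE_CELLS, sensitivity=0.8, specificity=0.8)
print(round(crude_or(TRUE_CELLS), 2))                # confounded estimate
print(round(mantel_haenszel_or(TRUE_CELLS), 2))      # adjusted, perfect confounder
print(round(mantel_haenszel_or(measured_cells), 2))  # adjusted, mismeasured confounder
```

In this invented example, adjustment for the mismeasured confounder
moves the estimate only part of the way from the crude value towards
the true null value, so that residual confounding remains.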

4.3.2.2 Ecological studies 

Confounding in ecological studies is more complex than in analytical
studies, its effects are potentially much greater, and its control
more difficult (Greenland & Morgenstern, 1989; Greenland, 1992). The
effects of error in the measurement of confounders on the results of
ecological studies have not been described.


4.3.3 Example of effects of exposure measurement error on QEP

Much work has been carried out to evaluate and take into account
sources of systematic and random error in the radiation dosimetry of
the atomic bomb survivors. Around 1986, a new dosimetry system,
"DS86", was introduced to replace the dose estimates, referred to as
"T65D", which had previously been used for risk estimation in studies
of atomic bomb survivors. Correlations of doses with chromosome
aberration data indicate that DS86 doses are more accurate than T65D
doses, particularly for some groups of survivors with complicated
shielding patterns. DS86 doses were estimated for a number of
specific target organs and are also more accurate than T65D doses,
which were based only on the application of a set of organ transfer
factors. Overall, DS86 neutron doses were less than the corresponding
T65D ones. DS86 doses from γ-rays, however, were generally larger
than the corresponding T65D ones, the difference increasing with
distance from the epicentre and varying with the characteristics of
shielding, irrespective of whether the survivor was a child or an
adult at the time of the bombing and of the target organ.

The introduction of DS86 in risk estimates coincided with a change in
the risk models used to estimate and predict risk from data on atomic
bomb survivors and with the analysis of extended follow-up data. The
results of the life-span study follow-ups (Preston et al., 1987;
Shimizu et al., 1990) are therefore not strictly comparable. However,
Preston & Pierce (1988) carried out analyses to evaluate the impact
of the change in dosimetry on risk estimates. Their results indicate
that the improvement in the accuracy of the dose estimate resulted in
a 75-85% increase in estimates of radiation-induced mortality from
leukaemia and from all other cancers.

Following this, methods were developed (Pierce et al., 1990, 1992) to
adjust risk estimates for the biases resulting from random errors in
individual dose estimates. The magnitude of the random errors varied
with dose, being greatest for survivors with doses above 4 Gy. It was
concluded that random errors of 30-40% can lead to underestimation of
the risk of solid cancers by 7-11% and of leukaemia by 4-7%.

The adequacy of neutron dosimetry, particularly in Hiroshima, has
recently been questioned (Straume et al., 1992). It appears that DS86
does not fully account for the neutrons probably present at
Hiroshima, particularly at the greater distances, and additional work
will need to be carried out to improve the dosimetry.

4.3.4 Conclusions 

Error in measurement of exposure can have substantial and largely
unpredictable effects on the direction and shape of dose-response
relationships. The consequences of exposure measurement error for QEP
are potentially serious, therefore, unless the effects of error can
be controlled or error can be prevented from occurring in the first
place.

4.4 Prevention of exposure measurement 
error 

To the extent that it is economically and logistically feasible,
preventing errors in measurements of the primary exposure and
potential confounders is the best way to control measurement error
and therefore to minimize its effects on QEP. There are two general
approaches to the prevention of measurement error: (i) quality in the
design of measurement instruments; and (ii) quality control in making
and using measurements.

4.4.1 Quality in the design of measurement instruments

Instruments for measuring exposure to carcinogens for the purposes of
QEP should be designed to make the right measurements accurately.
There are few general prescriptions for accuracy at the design stage
as its determinants vary from one kind of measurement instrument to
another. The removal of human subjectivity, however, is a desirable
characteristic of all measurement processes. In the administration of
a questionnaire by interview, for example, this will require the use
of carefully trained interviewers, a highly structured questionnaire,
and keeping the interviewers ignorant, in the specific situation, of
whether they are interviewing a person with or without disease or
with or without exposure. In laboratory measurements on environmental
or biological samples, it will require the processing of all samples
without those concerned knowing whether they come from subjects with
or without disease.

Some specific procedures for ensuring high 
quality in the design of instruments for the field 
collection of epidemiological data, including the 
collection of biological samples, are summarized 
in Table 4.2. 



Table 4.2. Quality in the design of measurement procedures for the
collection of data on exposure in epidemiological studies

Design of forms:

Include all items needed to compute dose, timing of exposure, etc.
Include adequate subject identifiers: at least an identity number and a check digit or alphabetic code on all forms
Use separate forms for each method of exposure measurement
Make instructions clear and data collection items unambiguous
Use different typefaces for instructions, data collection items and responses
Provide mutually exclusive and exhaustive response categories for closed-ended items
Make forms self-coding for simple items, e.g. data collector circles a number corresponding to the appropriate response category
Make response codes consistent within and across forms, e.g. 1 = yes, 2 = no
Provide for coding without loss of information, i.e. do not design forms so that continuous data are categorized at the coding stage


Measurement of Exposure ai 

\ 

i 


Design of forms (eontd): 

^ Do not require computation by 

j Design forms for direct entry of e 

! Design of specimen collection: 

I s Design procedures for specimen ■ 
laboratory procedures 

j study procedures manual: 

Always have a study procedures i 
| l ndude at least the following in th 

j » Description of the sway in ye- 
1 . > Sample selection, resniiiment 

j - 'Data forms 

j . - General'methods o! data colli. 

“ Item by item clarification of qir 
• Detailed procedures for the a 
» Editing procedurea 
» Coding instructions for items 
* Codebooks 

; Update manual and distribute up>f- 

j Pretesting Instruments: 

1 " : Have instruments reviewed by oth 

j Pretest instruments on samp es of 

| ■; Train data collectors and pretest ip- 

j l Identify problerns through feedback 

j r observing interviews, reabstradirrg 

i j Review frequencies of responses !■ 

j j. Modify instrument 

j | Pretesting Instruments: 

j ;■ Discuss importance of complete an- 

] ; Review study manual 

j f. Practise data collection 

I j Monitor initial data collection by eae 
: Resolve problems 

ji ,1 Source: Armstrong el si. (1902) 
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Toble 4.2 (Contd) Quality In the design of, measurement procedures for the ocillecSon.ol.data on 

exposure in.epidemiological studies" ^ 


Design of forms (contd): 

Do not require computation by data collectors, rather make provision on the form for entry oi raw data into the computer 
Design forms for direct entry at data into the computer 
Design of specimen collection: 

Desigh’procedures for specimen collection; idertiiioation, transport and storage in consultation with specialists in the 1 
laboratory procedures 

Study procedures rncmirlr 

Always have a study procedures manual 

Include at least the following in Ihe study procedures manual: 

• Description of the study in general terms / 

• Sample selection, recruitment auc,' hacking procedures 

• Data forms. 

• Gene/at methods of rials collection 

• item by' Hem clarification of questions and responses on terms ann questionnaire, including special cases 

• Detailed procedures for the collection of biological samples 

• Editing procedures 

• Ceding instructions for items not self-coded or, form 

» Cvdabccks ' 

Update manual and distribute updated pages whenever procedural changes are made 
Pretesting Instruments: 

Havo instruments reviewed by other researcheiis 
Prctc-st instruments on samples of convenience 

Train data collectors and pretest instruments on samples similar to study subjects 

Identify problems through fnnoback fiom pretested subjects and date collectors and by monitoring data cc.lection { 0 . 9 .• 
observing interviews, reabstraciing records) and make appropriate changes as early as possible 

Review frequencies of responses to identify Hams with little variation in responses 

Modify instrument 

Pretesting Instruments: 

Discuss importance of complete and accurate date 
Review study manual 
Practise data collection 

Monitor initial data collection by each data collector 
Resolve problems 


data are categorized at the 


Source: Amistronn eta/., (W32) 
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Measurement of Exposure and O* 


4.4.2 Quality control in making and using measurements

The most important steps in quality control in making and using
measurements are ensuring that the measurement procedures are
written down, that specific quality-control measures form part of
these procedures, and that all staff involved in making
measurements are adequately trained. Table 4.3 summarizes specific
quality-control procedures in the field collection, processing and
statistical analysis of epidemiological data. Some major aspects
of quality control in the laboratory analysis of biological or
environmental samples collected as part of an epidemiological
study are also listed. More comprehensive coverage of quality
control in analytical laboratories can be found in Westgard & Klee
(1986) and Copeland (1989).



Table 4.3. Quality control in making and using exposure
measurements in epidemiological studies

Quality control during data collection

Supervision of data collectors:

  Assign cases and controls in a case-control study (or exposed and unexposed subjects, where this is known in advance, in a cohort study) in the same proportion to each data collector
  Maintain ignorance of data collectors to case-control (exposed-unexposed) status of subjects, as far as possible
  Replicate some proportion of data collection (e.g. 10% of subjects) to identify fictitious data, items with poor reliability, data collectors with errors on certain items, etc.
  Compare the distribution of study variables among data collectors
  Compare distributions of study variables over time
  Address problems identified through monitoring immediately with the relevant data collector(s)
  Conduct staff meetings for retraining, discussion of problems and motivation

Editing and coding:

  Have data collectors edit data forms immediately to clarify responses and check for missing items
  Have supervisor perform a second edit soon after data collection to check for missing items, inadmissible codes, inconsistencies among responses, illegible responses, etc.
  Have editor code open-ended questions and query those inadequately answered
  Correct errors by call-back to subjects (or check-backs to records)
  Have one staff member maintain an editors' log to ensure consistency of recording and coding of unanticipated responses, and to record comments and responses coded as "other"

Handling and analysis of biological samples:

  Ensure as far as possible that samples are collected at around the same time from matched subjects with and without disease, or numerically balanced samples of subjects with and without disease, so that their storage and analysis times can be similar
  Record time and conditions of collection, initial storage, transport, entry into secondary storage, and removal for laboratory processing; review periodically and investigate reasons for departure from agreed standards
  Reasons should be given by the laboratory for rejection of samples as unacceptable and investigated in case they point to errors in collection, storage and transport procedures that can be corrected
  Ensure, as far as possible, that matched subjects with and without disease, or numerically balanced samples of subjects with and without disease, are included in each batch of laboratory analyses
  Undertake all laboratory analyses without knowledge of which samples come from subjects with and without disease
  Implement local, within-batch quality control and standardization procedures according to best laboratory practice. Optimally, participate also in a programme of external standardization of the analyses. If reference samples are not available, assay blind duplicates of some samples in the same batch and in different batches

Quality control during data processing

Key entry:

  Create a codebook with format and code of "raw" data
  Enter data contemporaneously with data collection
  Double enter (verify) all data
  Edit data by computer by performance of range and logic checks contemporaneously with data entry
  Correct errors and feed back findings of relevance to data collection

Creation of new variables:

  Check and recheck the programming code used to create new variables
  Check the correctness of new variables by manual computation from a sample of original records, whenever reasonably possible
  Review distributions of original and created variables
  Create a codebook with detailed descriptions of new variables created, including original variables and programming code used to create them

Source: Armstrong et al. (1992)
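
Two of the key-entry steps in Table 4.3, double entry (verification) and computer editing by range and logic checks, lend themselves to a short illustration. The sketch below is a minimal one under stated assumptions: the field names (`age`, `ever_smoked`, `age_started_smoking`), the admissible ranges and the coding 1 = yes, 2 = no are hypothetical and would in practice come from the study codebook.

```python
Record = dict[str, object]


def double_entry_discrepancies(first: dict[str, Record],
                               second: dict[str, Record]):
    """Compare two independent key-entry passes.

    Returns a list of (subject_id, field, first_value, second_value)
    for every field on which the two passes disagree.
    """
    problems = []
    for subject_id in sorted(set(first) | set(second)):
        a = first.get(subject_id, {})
        b = second.get(subject_id, {})
        for field in sorted(set(a) | set(b)):
            if a.get(field) != b.get(field):
                problems.append((subject_id, field, a.get(field), b.get(field)))
    return problems


def range_and_logic_errors(record: Record) -> list[str]:
    """Apply illustrative range and logic checks to one record."""
    errors = []
    age = record.get("age")
    smoker = record.get("ever_smoked")            # assumed coding: 1 = yes, 2 = no
    start = record.get("age_started_smoking")

    # Range checks: values outside the admissible set are flagged.
    if not isinstance(age, int) or not 0 <= age <= 110:
        errors.append("age out of range")
    if smoker not in (1, 2):
        errors.append("ever_smoked has inadmissible code")

    # Logic checks: combinations of responses that cannot all be true.
    if smoker == 2 and start is not None:
        errors.append("never-smoker has an age at starting smoking")
    if smoker == 1 and isinstance(age, int) and isinstance(start, int) and start > age:
        errors.append("started smoking after current age")
    return errors
```

Running such checks at the time of data entry, as the table recommends, means that discrepancies can be referred back to the data collector while the subject or record is still accessible.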


4.4.3 Other factors

Much good advice can be given about the prevention of error in
measurement of exposure and it is self-evident that at least some
of the procedures proposed above will lead to its reduction. How
free from error measurements may become when all of these
procedures are adopted will depend very much on the exposure being
measured and the options available for its measurement.

Some exposures, such as the smoking of cigarettes, are
comparatively easy to measure. This is so because, traditionally,
cigarette smoking has been a habit commenced in youth and indulged
in more or less continuously until the habitué relinquishes it,
usually in middle life, or dies. Its adoption and relinquishment
are important events and are therefore easily remembered.
Moreover, the number of cigarettes smoked per day is reasonably
constant and also easily remembered because the smoker must
regularly purchase fresh supplies. However, while empirical data
indicate that cigarette consumption is recalled well if constant,
it may be recalled with bias if consumption has changed (Persson &
Norell, 1989). Prospectively collected data on cigarette
consumption have been used to produce plausible QEPs (see, e.g.
Doll & Peto, 1978; Doll et al., 1994).

However, the measurement of cigarette smoking and, in particular,
the consequential exposure to carcinogens, has become more
complicated over the past 30 years because of changes resulting
from concerns about the health effects of smoking. The use of
filter cigarettes has become almost universal in industrialized
countries, cigarettes that deliver smaller amounts of tar have
been designed and their use has increased substantially. Thus, to
estimate the intake of tar from tobacco smoke retrospectively now
requires that smokers recall when they took up the use of filter
cigarettes and the brands that they smoke, a rather more demanding
task. Smokers may now also tend to underestimate their cigarette
consumption because of social disapproval of their habit. While
none of these problems is insurmountable, it is now more difficult
to obtain an accurate measure of intake of tobacco smoke
carcinogens (a complex mixture for which simple cigarette
consumption was a proxy) than it used to be.

Some other exposures are extremely difficult to measure no matter
how carefully they are approached. For example, dietary intake of
sodium cannot be measured accurately by any form of dietary
assessment except the observation, weighing and sampling, for
analysis, of all food eaten in some defined period of time. Much
simpler than this, although by no means easy, is the measurement
of urinary sodium excretion which, if carried out over 14 periods
of 24 h, provides a reasonably accurate reflection of present
sodium intake (Liu et al., 1979). Neither of these approaches can
provide a measure of anything other than present intake of sodium
unless repeated. Thus accurate retrospective measurement of
individual intake of sodium for, say, a case-control study of
cancer of the stomach (with which it may be associated) must be
considered to be impossible unless it can be assumed that dietary
sodium intake is reasonably constant over time. This assumption is
unlikely to be true.

Between these two extremes lie most other exposures of interest in
quantitative estimation and prediction of risk of cancer. All are
measured with some error. Thus steps must be taken to control the
effects of the error or it will influence the results of QEP.




4.5 Control of the effects of exposure measurement error

Issues relating to the control of the effects of exposure
measurement error have been discussed by Willett (1989). There are
three possible approaches to control when measurement error has
not been prevented (Greenland & Robins, 1985; Armstrong et al.,
1992): (i) the use of multiple measures of exposure; (ii) the
adjustment of study results for the effects of measurement error;
and (iii) in some circumstances, adjustment in the analysis for a
covariate which is related in some way to the occurrence of error
in the exposure measurement. In principle, each of these
approaches can produce a more accurate representation of the
dose-response relationship and therefore allow more accurate QEP.

4.5.1 Multiple measures of exposure 

The average or some other combination of the information in
multiple measures of the same exposure for each individual in an
analytical study (or population in an ecological study) is, in
principle, a more accurate measure of exposure than a single
measurement because of reduction in within-subject variability
(Marshall, 1989; Brenner & Blettner, 1993). The multiple measures
may be either repeated measures using the same instrument (e.g.
repeated collection of dietary diaries over a one-year period) or
measures using different instruments (e.g. a combination of
measurements from a food frequency questionnaire and a period of
collection of dietary diaries). There may be no advantage and may
even be a disadvantage in the latter approach if one of the two
instruments is substantially less accurate than the other. The
average of the two measurements may then be less accurate than the
measurement with the more accurate instrument.
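
Both points can be made concrete with a little arithmetic. The sketch below is illustrative only and rests on assumptions that are not stated in the text: a classical additive error model in which each measurement equals the true exposure plus an unbiased random error, with errors independent of the truth and of each other. Under those assumptions the validity coefficient of the mean of k replicates follows the Spearman-Brown form, and the error variance of the simple average of two instruments is one quarter of the sum of their error variances. The function names are hypothetical.

```python
import math


def validity_of_mean(rho_single: float, k: int) -> float:
    """Validity coefficient of the mean of k replicate measurements.

    Assumes classical, independent, unbiased errors with equal variance
    across replicates; rho_single is the validity coefficient of one
    measurement.
    """
    rho2 = rho_single ** 2
    return math.sqrt(k * rho2 / (1 + (k - 1) * rho2))


def error_variance_of_average(var_a: float, var_b: float) -> float:
    """Error variance of the unweighted average of two instruments
    with independent errors of variance var_a and var_b."""
    return (var_a + var_b) / 4


if __name__ == "__main__":
    for k in (1, 2, 4, 14):
        print(k, round(validity_of_mean(0.5, k), 3))
    # Averaging helps only while the poorer instrument's error variance
    # is less than three times that of the better instrument.
    print(error_variance_of_average(1.0, 2.0))   # smaller than 1.0
    print(error_variance_of_average(1.0, 4.0))   # larger than 1.0
```

Under the same assumptions, a weighted combination that gives less weight to the poorer instrument would avoid the loss of accuracy; the unweighted average is shown because it is the case the paragraph above warns about.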

The use of multiple measures of exposure is most unlikely to fully
control the effects of measurement error. In the case of repeated
applications of the same instrument, for example, no correction
will be obtained for bias in measurements or for error which
differs between subjects with and without disease. While some
correction of both may be achieved if two different measurement
instruments are used, full correction will only be possible if
both instruments were perfect in the first place or had exactly
opposite biases, which is highly unlikely.

4.5.2 Adjustment of study results for the effects of 
measurement error 

Table 4.4, taken from Armstrong et al. (1992), gives equations
expressing the true logistic regression coefficients or odds
ratios in terms of their observable values and exposure
measurement error. In principle, these equations can be used to
estimate the true values given the observed values and estimates
of the error expressed in appropriate terms. Estimates of the
error can be obtained by means of a validity study in a sample of
the subjects in the epidemiological study.


Table 4.4. Equations for the true measure of association as a
function of the observable measure of association and the exposure
measurement error (a)

Equation (b)                                                    Type of error

Continuous exposure, dichotomous outcome:

  $\beta_T = \dfrac{\beta_O \left[ 1 - \dfrac{b_D - b_N}{\mu_{X_D} - \mu_{X_N}} \right]}{\rho_{TX}^2}$      Differential

  $\beta_T = \beta_O / \rho_{TX}^2$                             Non-differential

  $OR_T = OR_O^{\,1/\rho_{TX}^2}$                               Non-differential

Continuous exposure, continuous outcome:

  $\beta_T = \beta_O / \rho_{TX}^2$                             Non-differential

Dichotomous exposure, dichotomous outcome:

  $OR_T = \dfrac{P_D (1 - P_N)}{P_N (1 - P_D)}$                 Differential or non-differential

  where $P_D = (p_D - 1 + spec_D) / (sens_D + spec_D - 1)$
  and   $P_N = (p_N - 1 + spec_N) / (sens_N + spec_N - 1)$

(a) Source: Armstrong et al. (1992).
(b) $\beta_T$ and $\beta_O$ are true and observable logistic regression coefficients, $OR_T$ and $OR_O$ are true and observable odds ratios, $b_D$ and $b_N$ are bias in the observable measurements in diseased (D) and non-diseased (N) subjects, $\mu_{X_D}$ and $\mu_{X_N}$ are expectations of the observable measurements in diseased and non-diseased subjects, $\rho_{TX}$ is the correlation between the true and observable measurements (the validity coefficient), $p_D$ and $p_N$ are the observable proportions exposed in diseased and non-diseased subjects, $spec_D$ and $spec_N$ are the specificities of the observable measurements in diseased and non-diseased subjects, and $sens_D$ and $sens_N$ are the sensitivities of the observable measurements in diseased and non-diseased subjects.
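
The non-differential corrections for a continuous exposure in Table 4.4 (division of the observed coefficient by the squared validity coefficient, or equivalently raising the observed odds ratio to the power of its reciprocal) and the correction for a dichotomous exposure (from observed proportions exposed, sensitivity and specificity) can be sketched as follows. This is an illustration, not a substitute for the methods in the cited sources: the function names are hypothetical, the inputs are treated as known without sampling error, and the example values are invented.

```python
import math


def correct_coefficient(beta_obs: float, rho_tx: float) -> float:
    """True regression coefficient under non-differential error in a
    continuous exposure: beta_T = beta_O / rho_TX**2."""
    return beta_obs / rho_tx ** 2


def correct_odds_ratio_continuous(or_obs: float, rho_tx: float) -> float:
    """True odds ratio under non-differential error in a continuous
    exposure: OR_T = OR_O ** (1 / rho_TX**2)."""
    return or_obs ** (1 / rho_tx ** 2)


def true_proportion(p_obs: float, sens: float, spec: float) -> float:
    """True proportion exposed, from the observed proportion and the
    sensitivity and specificity of the exposure measurement."""
    denom = sens + spec - 1
    if denom <= 0:
        raise ValueError("sens + spec must exceed 1 for the correction to apply")
    return (p_obs - 1 + spec) / denom


def correct_odds_ratio_dichotomous(p_d: float, p_n: float,
                                   sens_d: float, spec_d: float,
                                   sens_n: float, spec_n: float) -> float:
    """True odds ratio for a dichotomous exposure; sensitivity and
    specificity may differ between diseased (d) and non-diseased (n)."""
    big_p_d = true_proportion(p_d, sens_d, spec_d)
    big_p_n = true_proportion(p_n, sens_n, spec_n)
    return big_p_d * (1 - big_p_n) / (big_p_n * (1 - big_p_d))


if __name__ == "__main__":
    # Invented values, for illustration only.
    print(correct_odds_ratio_continuous(1.5, 0.7))
    print(correct_odds_ratio_dichotomous(0.4, 0.2, 0.8, 0.9, 0.8, 0.9))
```

With imperfect sensitivity or specificity the corrected proportions can fall outside the range 0 to 1 when the inputs are mutually inconsistent; a practical implementation would need to check for this.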

In practice, the position is much more complex. First, there may
be no satisfactory way of measuring the validity of the exposure
measurements actually made because no perfect measure of exposure
exists. A comparison of the method actually used with what is
presumed to be a more accurate method may be made to provide an
upper limit estimate of the validity coefficient, $\rho_{TX}$, in
the case of a continuous measurement of exposure, and therefore a
probably conservative correction for the effects of measurement
error. Second, the equations in Table 4.4 assume, among other
things, that the extent of measurement error is uncorrelated with
the value of the true exposure. If this and other assumptions are
not true, the adjustment will not be valid. Third, allowance needs
to be made for sampling error in the observable disease-exposure
association and the measurements of error. Finally, information on
the multivariate measurement error structure of the primary
exposure variable and the confounding variables is required if an
adjustment for error in the measurement of all relevant variables
is to be attempted.

While work has been done on a number of the problems listed above,
and procedures are available that take account of some of them
(Armstrong et al., 1992; Thomas et al., 1993), it will in most
cases be difficult to overcome the difficulty that a source of
true measurements against which a sample of the measurements
actually obtained can be compared is lacking (Wacholder et al.,
1993). Therefore, while dose-response relationships adjusted for
the effects of measurement error may be nearer the truth than the
unadjusted relationships, they are unlikely to be close to
perfect. Moreover, because the sensitivity and specificity or the
validity coefficient will often have been derived from a small
sample of subjects, the sampling error in any adjusted
dose-response relationship or prediction of risk will be
correspondingly large.

An example of the adjustment of a continuous dose-response
relationship with reference to a validation study carried out in
the same population of subjects is shown in Table 4.5, based on
the US Nurses' Health Study (Rosner et al., 1989). This examines
the effect of using a comparison of calorie-adjusted daily intake
of saturated fat, measured both by food frequency questionnaire
and by multiple daily records of weighed dietary intake in a
sample of 173 nurses, to adjust the estimate of relative risk per
10 g increase in calorie-adjusted saturated fat intake based on
the food frequency questionnaire. As would be expected, the
adjusted relative risks are further from the null value (1.0) than
the unadjusted estimates, and their 95% confidence intervals are
wider because of the component of sampling error in the
coefficient of regression of the two estimates of adjusted
saturated fat intake.

Table 4.5. [...] measurement of [...] saturated fat intake (a)

Validation study                          Main cohort study
Days of diet   Estimate of regression     Observed relative     Adjusted relative
record         coefficient (SE) (b)       risk (95% CI) (c)     risk (95% CI) (c)

28             0.468 (0.048)              0.92 (0.80-1.05)      0.83 (0.61-1.12)
2              0.540 (0.090)              0.92 (0.80-1.05)      0.85 (0.65-1.11)

(a) Based on data from Rosner et al. (1989).
(b) Coefficient of linear regression of estimated calorie-adjusted saturated fat intake from diet records on estimate from food frequency questionnaire. SE = standard error.
(c) Relative risk per 10 g increase in saturated fat. CI = confidence interval.

While the bias in the unadjusted estimate in this example is
apparently not large, the adjusted estimate is open to question.
First, while the recording of weighed dietary intake is probably
the most accurate method of estimating nutrient intake, it is not
perfect. For example, the close observation of diet that it
entails is likely to change dietary intake in unpredictable ways.
Second, while the 28 days of recording were spread throughout one
year, they could only relate to diet in the recent past. The food
frequency questionnaire, on the other hand, could, in principle,
have related to diet over a much longer period. Neither, however,
is likely to have been an accurate measure of the period of life
in which dietary variables may be most important in breast cancer
aetiology, namely, childhood (de Waard, 1986).

4.5.3 Control of a covariate related to measurement error

It is common in epidemiology to take steps either in the design or
analysis of a study to ensure comparability in measurements on
subjects with and without disease. For example, when multiple
interviewers are used in a case-control study, each case and his
or her matched controls may be interviewed by the same interviewer
and at about the same time to equalize, between cases and
controls, interviewer error and any error in response that may
correlate with time. Similarly, when proxy respondents are
necessary for some subjects (usually because they died before an
interview could be obtained), a proxy may also be used for the
matched controls, or the analysis may be stratified on whether the
subject or a proxy was interviewed (Walker et al., 1988). Such
strategies will sometimes reduce bias in the effect measures. As a
general rule, if misclassification is non-differential within
levels of a covariate but varies across those levels, the validity
of the effect measure will be improved by control of the
covariate. In other circumstances, control of a covariate related
to measurement error may increase bias in the effect measure
(Greenland & Robins, 1985).

Walker et al. (1988) have given a hypothetical example of the
effect of control of respondent status (self or proxy) in a
case-control study of the relationship between smoking and a
disease. They assumed that proxy responses were obtained for 50%
of cases and 10% of controls, and used the matrix of
proxy-reported versus self-reported smoking habits obtained by
Rogot & Reid (1975). The results are summarized in Table 4.6. The
effect of the differential measurement error introduced by the two
different sources of exposure measurements is to slightly increase
the relative risk of disease in smokers of more than one pack per
day and to alter the shape of the dose-response relationship at
the bottom of the distribution of cigarette use. Adjustment for
respondent type corrects both of these biases, but increases
slightly the width of the 95% confidence intervals about the
relative risk estimates because of maldistribution of controls
between the two groups of cases of equal size in the stratified
analysis.
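
A stratified analysis of this kind can be sketched with the Mantel-Haenszel summary odds ratio, computed across strata defined by respondent type. The text does not say which estimator Walker et al. (1988) used, so the Mantel-Haenszel estimator is an assumption made here for illustration, and the cell counts below are invented rather than taken from their example.

```python
def odds_ratio(a: float, b: float, c: float, d: float) -> float:
    """Odds ratio from a 2 x 2 table.

    a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls.
    """
    return (a * d) / (b * c)


def mantel_haenszel_or(strata) -> float:
    """Mantel-Haenszel summary odds ratio over a list of (a, b, c, d) strata."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical counts: one stratum per respondent type.
    self_respondents = (10, 20, 5, 40)
    proxy_respondents = (30, 10, 20, 20)
    strata = [self_respondents, proxy_respondents]

    # Crude estimate: collapse the strata into a single table.
    crude = odds_ratio(*(sum(cells) for cells in zip(*strata)))
    print("crude OR:", round(crude, 2))
    print("stratified (Mantel-Haenszel) OR:", round(mantel_haenszel_or(strata), 2))
```

Strata in which cases and controls are very unequally represented contribute little information, which is one way of seeing why the confidence intervals widen after adjustment in the example described above.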


of adjt .. . .. , 

proxy respond6nts ’ ^^^^^ii^^g^ > ^ 


Curran* smoking habits 





Nort-sirioker 

Occasional 

smoker 

< 1 pack per day 

1 pack per day 

>1 pack per day 

; •" f , ■ 



RR 

95% Cl 

RR 

95% Cl 

RR 

95% Cl 

RR 

95% Cl 
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t(95%Ci) c 

True relative 
risks 

1.0 

1.4 

C.5-3.5 

2.0 

14-2.9 

3.0 

2.1—4.3 

3.5 

2.3-5.3 

3(0.61-1.12) 

Crude observed 
relative risks 

1.0 

2.2 

i. 0-4.7 

1.7 

12-2.5 

2.9 

2.1-42 

3.7 

2.5-S.6 

5 <0.65-1.11) 

Observed reiativo 

1.0 

1.5 

Q.Q-3,6 

2.0 

1.3-3,1 

2.9 

2.0-4.3 

3.3 

2.1-5.2 


- from food {/t^uency ques* 


risks adjusted 

far respondent - , 

1 . B3se;t on data from Walker e: al. (1988). Hypolnsisal essmpla eta case-control study of the relationship between a disease and current 
smulmg habits in which 60% nf twposure measurements in cases came from proxy respondents but only 10% in controls, 
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4.5.4 Resources for the control of exposure mea- some circumstances be more efficient to repeat 
surement error the planned measure in all subjects and reduce 

Strategies to control the effects of exposure the total sample size than to measure only once 
measurement error without actually preventing it (Phillips & Davey Smith, 1993). 
can reduce the effects of both bias and random 

error on the dose-response relationship and 4.6 Error in the measurement of outcome 
therefore lead to more accurate QEP. None of In cancer epidemiology, the outcome of inter- 
these strategies is perfect, however, and it is est is the occurrence of cancer in the subjects 

unlikely that use of any or all of them will elimi- under study. However, although easy to concep- 

nate the effects of error entirely. In addition, two tualize, occurrence of cancer is a difficult event to 

of them, namely the use of multiple measure- measure. As a result, the event actually measured 

nients of exposure and adjustment of results with may be more or less closely related to it. 
reference to a validity or reliability study, are The concept of occurrence of cancer is ambigu- 

expensive and may be difficult in practice or, in some cases (e.g. most historical cohort studies), impossible.

The allocation of resources to preventing or controlling the effects of exposure measurement error may well be cost-effective given the consequences of measurement error on the numbers of subjects needed to detect an effect or estimate it with some given precision. Freedman et al. (1990) estimated the effects of error in the measurement of dietary fat intake on the numbers of subjects required in a cohort study to detect, with 90% power at p<0.05, relative risks of colorectal cancer of up to 2.3 over five levels of proportion of dietary calories derived from fat. If the correlation in the cohort between the measured and the true proportion of dietary calories derived from fat was 0.65, they showed that a cohort of 141 000 subjects with an expected incidence of colorectal cancer of 200 per 100 000 person-years followed for five years would be required. The number required if the true exposure could be measured was nearly 10-fold less, namely 16 000.

To what should resources for the prevention or control of the effects of measurement error be allocated? In general, it appears that, when the proposed measure of exposure is reasonably accurate, a validation study on a sample of the study population, and use of its results to correct the risk estimates obtained in the whole population, is the most cost-effective approach. When the proposed measure is very inaccurate, it may be more efficient simply to use the criterion measurement proposed for the validation study as the measure to be applied in the whole population (Greenland, 1988; Spiegelman & Gray, 1991). Where there is no criterion measure, it may in

ous. For example, how one considers a case of clinically silent cancer detected at necropsy will depend on the context of the study: it will be counted as a case in a study on the validation of death certificates by way of necropsy records but will be ignored in a study based solely on death certificates, and one could argue about its inclusion in a population-based case-control study. Even the most straightforward definition of a case of incident cancer, based on medical records reporting the results of diagnostic procedures and clinical reasoning, may be subject to ambiguity since medical procedures and diagnostic criteria differ among medical centres and among patients referred to the same centre (for a discussion on the comparability of data in cancer registries, see Parkin & Muir, 1992).

The most widely used outcome variables in cancer epidemiology, namely death from cancer, diagnosis of cancer, diagnosis of preneoplastic lesions, and molecular markers of early carcinogenic effect, can be seen as consecutive steps in a process corresponding to the putative natural history of the disease (for a review, see MacMahon & Pugh, 1970; Hulka et al., 1990; and section 4.2.3).

4.6.1 Death from cancer

A fundamental source of error in outcome ascertainment is incorrect specification of vital status. In particular, deaths may be missed when death records are incomplete. For example, in studies involving record linkage with mortality records, vital status may be incorrectly specified when personal identifiers are insufficient to ensure that no linkage errors occur. Linkage errors lead to bias and additional uncertainty in commonly used statistics such as standardized mortality ratios (SMRs) and relative risk regression coefficients, with an excess (deficit) of deaths leading to upward (downward) bias in the estimated SMRs.
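The direction of the bias in the SMR described above can be illustrated with a minimal sketch. The function names, the simple multiplicative model of missed deaths, and all numbers are assumptions made for illustration only; they are not taken from any study cited here.

```python
def smr(observed_deaths, expected_deaths):
    """Standardized mortality ratio: observed deaths divided by expected deaths."""
    return observed_deaths / expected_deaths


def deaths_after_linkage(true_deaths, missed_fraction=0.0, false_links=0):
    """Deaths actually counted after record linkage (hypothetical model).

    missed_fraction: share of true deaths not found in the mortality records.
    false_links: deaths wrongly attributed to cohort members by linkage errors.
    """
    return true_deaths * (1.0 - missed_fraction) + false_links


# Hypothetical cohort: 120 true deaths against 100 expected.
true_smr = smr(120, 100)
deficit_smr = smr(deaths_after_linkage(120, missed_fraction=0.10), 100)
excess_smr = smr(deaths_after_linkage(120, false_links=12), 100)

print(f"true SMR            {true_smr:.2f}")
print(f"with missed deaths  {deficit_smr:.2f}")   # deficit of deaths: biased downward
print(f"with false links    {excess_smr:.2f}")    # excess of deaths: biased upward
```

In this sketch the expected number is held fixed, so any error in the count of observed deaths passes directly into the SMR.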

Information on death from cancer is based mainly on death certificates, which are issued for legal reasons in many countries. Additional sources of information on death from cancer may include medical records and interviews with next-of-kin (see, e.g. Chen et al., 1990). Most epidemiological studies of occupational cancer risk are based on mortality data from death certificates.

The main advantage of the use of death certificates is their availability in many populations and for relatively long periods, allowing long-term follow-up of large groups of exposed people and easy derivation of reference rates in population-based studies. Among the problems associated with their use are limitations in the data obtained from death certificates resulting, e.g. from differences in survival in different populations, failure to report cancer in patients dying from other causes, and limitations in the accuracy of the certificates themselves, including, in particular, errors in the reported cause of death and in the coding process. In a study of 1243 death certificates mentioning cancer, which were coded independently in nine countries, the proportion of certificates coded with cancer as the underlying cause of death varied from 88% to 96% (Percy & Muir, 1989).

4.6.2 Incidence of cancer

Data on cancer incidence are usually derived from established systems such as cancer registries reporting routinely all cases occurring in a particular population (see, e.g. Parkin et al., 1992), or from ad hoc surveys of medical or other records (e.g. in a hospital-based case-control study). The use of data from high-quality cancer registries has several theoretical advantages, such as ensuring control of the quality and completeness of registration and minimizing diagnostic misclassification (Parkin & Muir, 1992). However, quality of cancer registration is often unsatisfactory, and cancer registration covers less than 400 million individuals worldwide (Parkin et al., 1992).

Studies using data on cancer incidence derived from ad hoc surveys are subject to additional problems because of their potential incompleteness, which may lead to bias with respect to the assumed population base, and because the information derived from sources other than medical records may not be correct (see, e.g. Steenland & Schnorr (1988) on diagnoses reported by next of kin).

One additional problem related to the use of data on cancer incidence is the fact that the degree of medical surveillance may vary among populations and groups, and thus lead to variable ascertainment of the true underlying incidence. Examples of such surveillance bias may be found in cancer patients followed up for incidence of secondary leukaemia (Kaldor et al., 1990) and occupational groups undergoing periodic medical examinations (Hiatt et al., 1993).

Whether the cases of cancer included in a study are obtained from mortality statistics, from a cancer registry or from an ad hoc survey (e.g. from the pathology department of a hospital), measurement error can be introduced in the diagnosis of the neoplasm. In particular, there can be a diagnostic error when the true disease is not diagnosed, and a histopathological error when the measured histopathological class is not the true class. It should be noted that histopathological error is not necessarily less serious than diagnostic error: the inclusion of benign lesions in a series of malignant tumours, and similarly the exclusion of malignant lesions misdiagnosed as benign tumours, will have very important implications in a study on the quantification of the effect of a risk factor.

These errors can be either systematic or non-systematic, as discussed earlier for exposure measurement errors. Thus a systematic histopathological error exists, e.g. when a pathologist has a consistent tendency to call a benign melanocytic naevus a level 1 melanoma.

The major difficulty in assessing diagnostic or histopathological validity is the lack of a gold standard of the characteristics being measured. In other words, there is no way to determine the absolute truth with respect to the disease at the level of organ, tissue and cell type. For this reason, measurements of diagnostic and histopathological error have usually focused on reliability rather than validity, and have measured intraobserver and interobserver reliability.

Validity and reliability are related in the sense that valid measurements are also reliable. Reliable measurements, however, are not necessarily valid, and the only certain conclusion is that an unreliable measurement is also invalid. Intraobserver reliability tends to be greater than the corresponding interobserver reliability because the single observer will repeat his or her own systematic error from observation to observation. Thus, the comparison of repeated measurements by one observer will be free from the effects of systematic error, whereas the often partly different systematic errors of different observers will be included in the assessment of interobserver agreement. Because systematic error is an important component of outcome measurement error, a measurement of interobserver reliability will be a better guide to validity (or at least to its upper limit) than a measurement of intraobserver reliability.

4.6.3 Preneoplastic lesions

Although there are many examples of preneoplastic lesions identified and measured in human populations, more important from the point of view of quantitative associations are those in which the preneoplastic lesions are used as surrogates for the cancer itself. In such cases, a quantitative estimation of risk can be made on the basis of the presence, or level, of the preneoplastic lesion. For example, melanocytic naevi are strong predictors of malignant melanoma of the skin (Armstrong & English, 1988), and genital warts and other manifestations of infection with human papilloma virus (HPV) are strongly associated with the risk of cervical cancer (Brinton, 1992).

Measurement of preneoplastic lesions, however, is also subject to random and systematic error. For example, in a study on the prevalence of melanocytic naevi in Western Australia in children assessed by two nurses, it was found that 4% of the variation in numbers of naevi of all sizes, and 8% of the variation in those of naevi of at least 2 mm in diameter, was due to interobserver variation (English & Armstrong, 1994). Detection of HPV infection strongly depends on the method used, and recently developed methods based on the polymerase chain reaction have improved the sensitivity of the assay (Schiffman et al., 1991) and led to a reassessment of the quantitative aspects of the association between HPV infection and risk of cervical cancer (Muñoz et al., 1992).
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4.6.4 Molecular markers of early carcinogenic 
effect

More recently, biological markers of early effects in the carcinogenic pathway have been used in studies in human populations (Perera & Weinstein, 1982). Such markers include non-specific lesions of the genome, such as sister chromatid exchanges (Wilcosky & Rynard, 1990), micronuclei (Vine, 1990) and chromosome aberrations (Schwartz, 1990), as well as more specific effects, such as activation of proto-oncogenes or inhibition of tumour-suppressor genes (Perera & Santella, 1993). In addition, biomarkers are now widely used to improve assessment of the biologically effective dose and to evaluate the role of individual susceptibility. It is clear, however, that when such methods are used, the traditional distinction between exposure and outcome becomes less clear, and the same event, such as the formation of a DNA adduct or the mutation of a relevant gene, can be considered either as a measure of exposure or as an effect.

Biomarkers of early carcinogenic effect may contribute greatly to knowledge of aetiological agents and the mechanism of tumour formation in humans (for review, see Hulka et al., 1990; Perera & Santella, 1993). In particular, from the point of view of quantitative estimation and prediction of risk, they may help to improve the classification of neoplastic lesions according to the genes that have been altered and the mutations in such genes, which may be risk-factor-specific. For example, it is possible to distinguish "spontaneous" mutations in the p53 tumour-suppressor gene from those caused by carcinogens such as benzo[a]pyrene or aflatoxin B1 (Hollstein et al., 1991; Jones et al., 1991). There are also mutations of the p53 gene in skin cancer and normal skin which are almost completely specific to exposure to UV radiation (Ziegler et al., 1993; Nakazawa et al., 1994).

In addition, these biomarkers may help to identify critical steps in the carcinogenic process, and therefore to contribute to the definition of biologically relevant models of carcinogenicity to be used in quantitative human risk assessment (Perera et al., 1991; Hattis & Silver, 1993). The technical variability of biomarkers has been discussed by Vineis et al. (1993).


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4.6.5 Sources, measurement and prevention of 
error in the measurement of outcome

The observer is not the only source of error in the measurement of outcome in epidemiology. Sources of error can be classified as "preobserver", "observer", and "postobserver". Table 4.7 provides some examples of these sources of error in the histopathological classification of neoplastic lesions. The preobserver sources of error are important in relation to interobserver reliability and to validity in general, because they may contribute to measurement error by making the observer's task more difficult or increasing the likelihood of different biases between observers, and may also increase the level of variation seen between observers in a formal study of interobserver reliability (e.g. examination of different sections by different observers gives rise to the possibility of different errors in sampling from the tissue block and different quality of preparation of the mounted, stained section). A bibliography on observer variability has been published (Feinstein, 1985).


Table 4.7 Examples of sources of error in the histological classification of cancer

Preobserver sources:
    State of preservation of tissue at the time of sampling
    Location of biopsy (right site; right tissue)
    Sampling of tissue for preservation and fixation
    Trauma to tissue in the course of sampling or biopsy
    Preservation and fixation of tissue
    Sampling of tissue section to be mounted and stained
    Preparation and processing of tissue section ready for examination
    Quality and maintenance of microscope
    "State of the art" of measurement of histopathological characteristics relevant to the tissue sample in question

Observer sources:
    Familiarity with the "state of the art" of measurement of histopathological characteristics relevant to the tissue sample in question
    Sampling of fields of the section for close examination
    Thoroughness in making the observations necessary to classify the tissue (influenced by factors such as personality, state of mind, pressure of time)
    Application of the classification rules implied in the "state of the art"

Postobserver sources:
    Recording and transcription
    Coding
    Data processing
    Communication
    Integration of observations and classification rules into a diagnosis

Source: https://www.industrydocuments.ucsf.edu/docs/txxj0001

Table 4.8 Requirements for the conduct of studies of interobserver reliability in histopathology

1. A representative set of sections of the cancer(s) to be studied. Ideally this set will come from the real life practice of pathology in a whole population so that the results will reflect what may actually happen in practice rather than what can be achieved in ideal circumstances.

2. Two or more willing pathologists. Again, ideally these pathologists will be involved in the general practice of histopathology of cancer rather than be particularly specialized in the cancer of interest.

3. Circulation of the sections to be studied, in random order and with only numerical identification, to each pathologist, who examines them without knowledge of the other pathologists' opinions of them. Ideally, because of the problem of sampling of the tissue block when taking a section and variation in the quality of the subsequent processing of the section, each pathologist should see the same section.

4. Standardized recording on a specially prepared form of the histopathological observations of interest. An example of a form used in a study of interobserver variation in measurement of the histopathological characteristics of malignant melanoma is given in Figure 4.3.


Interobserver reliability can be measured by means of ad hoc studies involving repeated independent measurements of the outcome based on the same, or at least very similar, material. It is obvious that in some instances a correct experimental design of the reliability study is not possible, since the measurement condition cannot be replicated (e.g. diagnosis based on macroscopic necropsy evidence). Some of the requirements for the conduct of a study of interobserver reliability in the histopathology of cancer are summarized in Table 4.8.

Several statistical methods have been proposed for use in measuring interobserver reliability. The simplest and most commonly used method is to determine the proportion of cases in which two (or more) observers agree. However, this method does not take into account the proportion of cases in agreement by chance only. It can be simply shown that this proportion varies with the prevalence of the outcome being measured, increasing as the prevalence increases to very high or decreases to very low values. A simple statistic corrected for chance agreement is the kappa statistic (Fleiss, 1971), which is the ratio of the difference between the observed proportion in agreement ($p_o$) and the proportion of agreement that can be expected by chance ($p_c$) to the proportion of agreement that cannot be attributed to chance ($1 - p_c$), so that:

$$\kappa = \frac{p_o - p_c}{1 - p_c}$$


An advantage of the kappa statistic is that it can be applied to nominal scale variables, which are common in pathology and histology. It can be interpreted like a correlation coefficient, i.e. its value varies from 0.0 (no agreement beyond that expected by chance) to 1.0 (perfect agreement) and to -1.0 (complete disagreement). While no strict rules exist, it has been stated that values of kappa in excess of 0.75 indicate very good agreement, and values below 0.40 indicate poor agreement (Landis & Koch, 1977). More information on the kappa statistic and on its limitations can be found in specialized papers (Maclure & Willett, 1987; Mazoyer & Mary, 1987; Dunn, 1989).
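As a minimal sketch of the calculation described above, the following computes kappa for two observers from a square agreement table, and shows how chance agreement depends on prevalence. It assumes the two-observer form of the statistic and, for the prevalence illustration, two observers who classify independently with the same probability of a positive call. The function names and the example counts are hypothetical.

```python
def kappa_two_observers(table):
    """Kappa for two observers.

    table[i][j] is the number of cases placed in category i by the first
    observer and in category j by the second. Undefined (division by zero)
    when chance agreement equals 1.
    """
    n = sum(sum(row) for row in table)
    k = len(table)
    # Observed proportion in agreement: the diagonal of the table.
    p_o = sum(table[i][i] for i in range(k)) / n
    # Marginal proportions for each observer.
    row_p = [sum(table[i]) / n for i in range(k)]
    col_p = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    # Agreement expected by chance from the marginal proportions.
    p_c = sum(row_p[i] * col_p[i] for i in range(k))
    return (p_o - p_c) / (1 - p_c)


def chance_agreement_binary(prevalence):
    """Chance agreement for a binary outcome when both observers call a case
    positive independently with probability `prevalence` (an assumption)."""
    return prevalence ** 2 + (1 - prevalence) ** 2


# Hypothetical table: rows = first pathologist, columns = second pathologist.
example = [[20, 5],
           [10, 15]]
print(round(kappa_two_observers(example), 3))

for prev in (0.5, 0.9, 0.99):
    print(prev, round(chance_agreement_binary(prev), 4))
```

In the example table the observers agree on 70% of cases, but half of all cases would be expected to agree by chance, so kappa is well below the crude proportion in agreement.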

Alternatives to the kappa statistic include the intraclass correlation coefficient, which is less sensitive to changes in the number of categories, and the Pearson correlation coefficient (Snedecor & Cochran, 1980).

In principle, error in outcome measurement can be prevented or reduced by dealing with the sources discussed. While every effort should be made to minimize the error in any single set of measurements, error can be reduced further if multiple measurements are made. In practice, this means the use of multiple observers. The measurement to be taken as correct when multiple observers are used may be the majority position of the observers. There is value, however, in having the observers meet after they have made their measurements, discuss their findings, and reach a consensus position on the value of the measurement (Kraemer, 1992).
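The majority rule mentioned above can be sketched as follows. This is an illustrative helper only: the function name is hypothetical, and returning `None` for a tie (to signal that the case should go to a consensus meeting) is an assumption rather than a procedure described in the text.

```python
from collections import Counter


def majority_classification(ratings):
    """Return the category chosen by most observers, or None when the leading
    categories are tied and no majority position exists."""
    if not ratings:
        raise ValueError("at least one rating is required")
    ranked = Counter(ratings).most_common()
    # A tie between the two most frequent categories means no majority.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


print(majority_classification(["melanoma", "naevus", "melanoma"]))
print(majority_classification(["melanoma", "naevus"]))
```

Cases for which the function returns `None` are the natural candidates for discussion among the observers.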


MELANOMA PROJECT PATHOLOGY ABSTRACT

1. SERIAL NUMBER:

2. TUMOUR WIDTH & BREADTH IN MILLIMETERS (2 largest diameters):
    i. Total lesion
    ii. Nodular portion

3. CROSS-SECTIONAL PROFILE:
    1 Flat
    2 Dome
    3 Polypoid
    4 Plateau
    5 Verrucous

4. HISTOLOGY:
    (a) Premalignant intraepidermal lesion without invasion
        1 HMF type
        2 SSM type
        3 Mixed
        4 Other (specify below)
    (b) Premalignant intraepidermal involvement adjacent to invasive tumour
        1 HMF type
        2 SSM type
        3 Mixed
        4 Other (specify below)
        5 None
    (c) Cell type predominating
        1 Epithelioid
        2 Spindle
        3 Naevus cell-like
        4 Mixed
    (d) Degree of pigmentation
        1 Heavy
        2 Moderate to light
    (e) Pre-existing or associated benign melanotic lesion
    (f) Chronic inflammatory reaction
        i. Intraepidermal component
            1 Severe
            2 Moderate
            3 Mild
        ii. Invasive component
            1 Severe
            2 Moderate
            3 Mild
    (g) Evidence of regression
        1 Yes
    (h) Mitotic activity in invasive component (HPF = high power field x 400)
        1 Less than 1 per 5 HPFs
        2 Between 1/HPF and 1/5 HPFs
        3 Greater than or equal to 1 per HPF
    (i) Solar elastosis
        1 Severe or moderate
        2 Mild
        3 Absent
    (j) Ulceration
        1 Yes

5. LEVEL OF INVASION:
    1 Intraepidermal
    2 Papillary dermal
    3 Papillary-reticular interface
    4 Reticular dermal
    5 Subcutaneous fat

6. TUMOUR DEPTH IN MILLIMETERS:

7. PATHOLOGIST:

Figure 4.3. Form used for recording each pathologist's observations on each case in an interobserver reliability study of the measurement of features of the histopathology of malignant melanoma. Based on Heenan et al. (1984).
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4.6.6 Effect of error in measurement of outcome 
on the dose-response relationship

The effect of outcome measurement error on the dose-response relationship is conceptually similar to the effect of exposure measurement error. When the outcome being measured is death from, or incidence of, cancer, the outcome variable is invariably categorical, usually binary (i.e., presence or absence of cancer). If misclassification of a dichotomous outcome variable is independent of misclassification of exposure, the bias is always towards the null. In practice, however, misclassification of exposure and outcome might be correlated, resulting in a bias in an unpredictable direction, even with non-differential misclassification (Kristensen, 1992; Brenner et al., 1993). When outcomes other than cancer are considered, there may be multiple categories, ordered, e.g. according to the severity of the disease (e.g. healthy, in situ carcinoma, invasive carcinoma) or reflecting multiplicity of a particular outcome in individual subjects (e.g. number of melanocytic naevi). The measurement of markers of early carcinogenic effect has added further complexity (Schulte et al., 1993). While such markers are usually measured on two levels, it is now possible to identify several outcome categories based on combinations of them. For example, in a hypothetical study based on the measurement of cancer and a marker of early effect, such as activation of a proto-oncogene, four outcome categories can be considered: (1) healthy subjects without the activation; (2) healthy subjects with the activation; (3) cancer patients without the activation; and (4) cancer patients with the activation. In this example, categories 1, 2 and 4 can be seen as successive steps in the carcinogenic process, in which the presence of the early lesion is a required step. The implications of measurement error in studies based on more than two categories of outcome have not been investigated in any detail, and the effect of their misclassification on the risk estimate will be difficult to predict.

The major issue in outcome measurement error is whether the error is differential or non-differential across exposure categories. One example of differential measurement error is a cohort study in which the population under investigation (considered as exposed) is compared with a reference (e.g. national) population, considered as unexposed, where the members of the exposed cohort are more likely to be diagnosed with the cancer of interest than the reference population (see discussion on surveillance bias above).

In the case of non-differential error in measurement of outcome, by analogy with the effects of error in measurement of exposure (see section 4.3), the observed relative risk (RR) will be biased towards the null value relative to the true RR (and the dose-response curve will be flattened) provided that the probability that the measurement classifies a true case as a case is the same as, or greater than, that with which it classifies a true non-case as a non-case. On the other hand, differential measurement error can cause a bias in any direction, and may even reverse the direction of the association (e.g. RR below 1 instead of above 1).
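The contrast between non-differential and differential outcome error can be made concrete with a small numerical sketch. It assumes a binary outcome measured with a given sensitivity and specificity in each exposure group. The function names, the true risks, and the sensitivity and specificity values are all hypothetical and chosen only to illustrate the direction of the bias.

```python
def observed_risk(true_risk, sensitivity, specificity):
    """Proportion classified as cases: true cases detected plus non-cases
    falsely classified as cases."""
    return sensitivity * true_risk + (1 - specificity) * (1 - true_risk)


def observed_rr(risk_exposed, risk_unexposed,
                sens_exposed, spec_exposed,
                sens_unexposed, spec_unexposed):
    """Relative risk computed from the misclassified outcome."""
    return (observed_risk(risk_exposed, sens_exposed, spec_exposed)
            / observed_risk(risk_unexposed, sens_unexposed, spec_unexposed))


# Hypothetical true risks giving a true RR of 2.0.
true_rr = 0.02 / 0.01

# Non-differential error: same sensitivity and specificity in both groups.
rr_nondiff = observed_rr(0.02, 0.01, 0.80, 0.99, 0.80, 0.99)

# Differential error: cases detected more completely among the exposed
# (as with surveillance bias), perfect specificity in both groups.
rr_diff_away = observed_rr(0.02, 0.01, 0.95, 1.0, 0.60, 1.0)

# Differential error in the other direction can reverse the association.
rr_diff_reversed = observed_rr(0.02, 0.01, 0.40, 1.0, 0.90, 1.0)

print(round(true_rr, 2), round(rr_nondiff, 2),
      round(rr_diff_away, 2), round(rr_diff_reversed, 2))
```

With these assumed values the non-differential estimate falls between 1 and the true RR, while the two differential scenarios give estimates above the true RR and below 1 respectively.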

A good example of the effect of outcome measurement error deals with the risk of soft-tissue sarcomas (STS) among workers exposed to dioxin (see also Chapter 8). STS are rare malignant neoplasms originating in tissues of mesodermal origin, such as fat, muscle and connective tissue. However, although most STS originate from organs primarily containing mesodermal tissue, some originate from visceral organs, such as the stomach, that also contain mesodermal tissue, but such sarcomas represent only a small fraction of all neoplasms originating in these organs. Since the International Classification of Diseases (ICD), which is used in many countries to code causes of deaths from death certificates, is based on the topography of the neoplasms, STS arising from visceral organs are coded as neoplasms of those organs, and are therefore not identified in a mortality study, which will be based on the ICD category for malignant neoplasms of connective and other soft tissues. The net result is a 30-70% underestimation of the number of STS (Lynge et al., 1987; Erikson & Gezelius, 1990).

Exposure to dioxin has been related to an increased risk of STS in a number of cohort studies in which the mortality of the exposed workers has been compared with the mortality of the national population (see section 9.2). In particular, a study of over 6000 workers in the USA has found a threefold excess mortality from STS, based on four observed and 1.2 expected cases, although the result was not statistically significant (Fingerhut et al., 1991). A careful examination of all death certificates of deceased cohort members and of the medical records of all those dying from cancers in sites where STS are likely to occur identified three additional cases of STS (Suruda et al., 1993). Therefore, three¹ out of seven STS cases (43%) were missed in the analysis based on death certificates. The degree of misclassification in the reference population used in this study is not known, but one can reasonably assume that the misclassification in the original comparison was non-differential. The RR of 3.3 (4/1.2), although not biased, is then less precise than it would have been had it been possible to use all the cases of STS identified in the cohort in its estimation (95% confidence interval based on four cases, 0.9-8.7; based on seven cases, 1.4-6.9). A narrower confidence interval would allow a more precise quantitative estimation of risk based on these results. Had a comparison of the seven cases been made with the population expected value of 1.2, there would probably have been differential error in outcome measurement, with the RR biased away from the null.

¹ Two were not STS.

4.7 Conclusions

Errors in measurement of exposure and outcome can have substantial and, to a large degree, unpredictable effects on the direction and shape of dose-response relationships. Measurement error can be prevented by careful selection and design of measurement instruments, and strict quality control in their use and in the processing and analysis of the measurement results. If not prevented, it can be controlled to some degree by the use of multiple measures of exposure and outcome, the adjustment of dose-response relationships for the effects of measurement error, and the control of covariates related to measurement error. While the application of these procedures may sometimes make the effects of measurement error negligible in QEP, this is probably [...]

There are a number of areas [...] could lead to better measurement [...] and outcome for QEP. First, dev[...]rate biological techniques to m[...] effective dose may produce [...] exposure (e.g. DNA adducts) [...] exposure (e.g. exposure-specific [...] are more accurate than any [...] of exposure currently available [...] first, however, is limited by the short exposure period that they cover, and the use of the second by the fact that they can give essentially no information about dose rate or pattern of exposure. To evaluate these problems and their significance, it would be desirable to incorporate in large cohort studies the periodic collection of biological samples for recurrent measurement of present exposure (to obtain information on dose rate, pattern of exposure and, ultimately, cumulative exposure over some period of time) and the collection, on a case-control basis within a cohort study, of biological measurements of cumulative exposure, ideally corresponding in some way to the measurements of present exposure.

Second, the development, as measures of outcome, of markers of early effects which are important in the process of cancer development may lead to studies that can be carried out sooner after exposure, more quickly, and with greater statistical power than those based on cancer incidence or mortality. The best way to investigate the significance and role of markers of early effect would be to make repeated collections of biological samples in prospective studies, and to carry out large surveys of these markers in high-risk populations with well characterized exposures.

Finally, there would be value in a thorough theoretical and empirical evaluation of the effects of measurement error on dose-response relationships and QEPs derived from them. The theory has been worked out only recently and, as yet, incompletely. Further substantial development and the exploration of its application to some of the more comprehensive data on dose-response in cancer epidemiology are possible.

References
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5.1 General principles 
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